On Separate Normalization in Self-supervised Transformers
Self-supervised training methods for transformers have demonstrated
remarkable performance across various domains. Previous transformer-based
models, such as masked autoencoders (MAE), typically utilize a single
normalization layer for both the [CLS] symbol and the tokens. We propose in
this paper a simple modification that employs separate normalization layers for
the tokens and the [CLS] symbol to better capture their distinct
characteristics and enhance downstream task performance. Our method aims to
alleviate the potential negative effects of using the same normalization
statistics for both token types, which may not be optimally aligned with their
individual roles. We empirically show that by utilizing a separate
normalization layer, the [CLS] embeddings can better encode the global
contextual information and are distributed more uniformly in the anisotropic embedding
space. When replacing the conventional normalization layer with the two
separate layers, we observe an average 2.7% performance improvement across the
image, natural language, and graph domains.
Comment: NIPS 202
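The proposed change is architecturally tiny: instead of one normalization layer shared by the [CLS] slot and the tokens, two layers with independent parameters are applied. A minimal NumPy sketch of the idea (function names and shapes here are illustrative, not the authors' code):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each row over the last (feature) dimension, then apply an affine map."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def separate_norm(embeddings, cls_params, tok_params):
    """Apply one LayerNorm to the [CLS] slot and a second, independent one to the tokens.

    embeddings: (seq_len, dim) array whose row 0 is the [CLS] embedding.
    cls_params / tok_params: (gamma, beta) pairs, each of shape (dim,).
    """
    cls_out = layer_norm(embeddings[:1], *cls_params)
    tok_out = layer_norm(embeddings[1:], *tok_params)
    return np.concatenate([cls_out, tok_out], axis=0)

dim = 8
rng = np.random.default_rng(0)
x = rng.normal(size=(5, dim))               # [CLS] + 4 tokens
cls_params = (np.ones(dim), np.zeros(dim))  # the two affine parameter sets start
tok_params = (np.ones(dim), np.zeros(dim))  # identical but diverge during training
y = separate_norm(x, cls_params, tok_params)
```

With identical parameters this reduces to a shared LayerNorm; the point of the paper is that letting the two sets diverge during training better matches the distinct roles of the [CLS] symbol and the tokens.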
A Systematic Survey of Chemical Pre-trained Models
Deep learning has achieved remarkable success in learning representations for
molecules, which is crucial for various biochemical applications, ranging from
property prediction to drug design. However, training Deep Neural Networks
(DNNs) from scratch often requires abundant labeled molecules, which are
expensive to acquire in the real world. To alleviate this issue, tremendous
efforts have been devoted to Chemical Pre-trained Models (CPMs), where DNNs
are pre-trained using large-scale unlabeled molecular databases and then
fine-tuned on specific downstream tasks. Despite this progress, this
fast-growing field still lacks a systematic review. In this paper, we present the
first survey that summarizes the current progress of CPMs. We first highlight
the limitations of training molecular representation models from scratch to
motivate CPM studies. Next, we systematically review recent advances on this
topic from several key perspectives, including molecular descriptors, encoder
architectures, pre-training strategies, and applications. We also highlight the
challenges and promising avenues for future research, providing a useful
resource for both the machine learning and scientific communities.
Comment: IJCAI 2023, Survey Track
Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model
Transition state (TS) search is key in chemistry for elucidating reaction
mechanisms and exploring reaction networks. The search for accurate 3D TS
structures, however, requires numerous computationally intensive quantum
chemistry calculations due to the complexity of potential energy surfaces.
Here, we developed an object-aware SE(3) equivariant diffusion model that
satisfies all physical symmetries and constraints for generating sets of
structures, i.e., reactant, TS, and product, in an elementary reaction.
Provided reactant and product, this model generates a TS structure in seconds
instead of the hours required when performing quantum chemistry-based
optimizations. The generated TS structures achieve an average root-mean-square
deviation of 0.13 Å from the true TS. With a confidence scoring model
for uncertainty quantification, we approach an accuracy required for reaction
rate estimation (2.6 kcal/mol) by only performing quantum chemistry-based
optimizations on 14% of the most challenging reactions. We envision the
proposed approach to be useful in constructing and pruning large reaction
networks with unknown mechanisms.
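For context, the 0.13 Å figure is a root-mean-square deviation between generated and reference TS geometries. A standard way to compute RMSD after optimal rigid alignment is the Kabsch algorithm; the sketch below is a generic implementation of that metric, not the paper's evaluation code:

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two conformers P, Q of shape (N, 3) after optimal rigid alignment."""
    # Center both structures at the origin.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Kabsch: SVD of the covariance gives the optimal rotation.
    H = P.T @ Q
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])        # sign fix to exclude reflections
    R = Vt.T @ D @ U.T
    P_rot = P @ R.T                   # rotate P onto Q
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))
```

Because the metric removes translation and rotation first, two geometries that differ only by a rigid motion have RMSD 0, so it measures genuine structural error.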
Improving Molecular Pretraining with Complementary Featurizations
Molecular pretraining, which learns molecular representations over massive
unlabeled data, has become a prominent paradigm to solve a variety of tasks in
computational chemistry and drug discovery. Recently, rapid progress has
been made in molecular pretraining with different molecular featurizations,
including 1D SMILES strings, 2D graphs, and 3D geometries. However, the role of
molecular featurizations with their corresponding neural architectures in
molecular pretraining remains largely unexamined. In this paper, through two
case studies -- chirality classification and aromatic ring counting -- we first
demonstrate that different featurization techniques convey chemical information
differently. In light of this observation, we propose a simple and effective
MOlecular pretraining framework with COmplementary featurizations (MOCO). MOCO
comprehensively leverages multiple featurizations that complement each other
and outperforms existing state-of-the-art models that rely on only one or
two featurizations on a wide range of molecular property prediction tasks.
Comment: 24 pages, work in progress
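One simple way to combine complementary featurizations is a learned weighted mixture of the per-view embeddings produced by, e.g., 1D SMILES, 2D graph, and 3D geometry encoders. The sketch below illustrates only this fusion idea; it is not MOCO's actual architecture:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_views(view_embeddings, logits):
    """Weighted sum of per-featurization embeddings.

    view_embeddings: list of (dim,) arrays, one per featurization
                     (e.g. outputs of 1D, 2D, and 3D encoders).
    logits: (n_views,) learnable scores turned into mixture weights,
            letting the model emphasize whichever view carries the
            relevant chemical information for a task.
    """
    w = softmax(np.asarray(logits, dtype=float))
    return sum(wi * e for wi, e in zip(w, view_embeddings))
```

With equal logits the fusion is a plain average; during training the weights can shift toward the view that best encodes, say, chirality (3D) versus aromaticity (2D), matching the paper's observation that featurizations convey chemical information differently.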
A new perspective on building efficient and expressive 3D equivariant graph neural networks
Geometric deep learning enables the encoding of physical symmetries in
modeling 3D objects. Despite rapid progress in encoding 3D symmetries into
Graph Neural Networks (GNNs), a comprehensive evaluation of the expressiveness
of these networks through a local-to-global analysis is still lacking. In this
paper, we propose a local hierarchy of 3D isomorphism to evaluate the
expressive power of equivariant GNNs and investigate the process of
representing global geometric information from local patches. Our work leads to
two crucial modules for designing expressive and efficient geometric GNNs:
local substructure encoding (LSE) and frame transition encoding (FTE).
To demonstrate the applicability of our theory, we propose LEFTNet which
effectively implements these modules and achieves state-of-the-art performance
on both scalar-valued and vector-valued molecular property prediction tasks. We
further point out the design space for future developments of equivariant graph
neural networks. Our code is available at
https://github.com/yuanqidu/LeftNet
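A property underlying such designs is that scalar features built from local geometry, e.g. pairwise distances, are invariant under global rotations, which is what allows local patches to be encoded consistently and then assembled into global geometric information. A small NumPy check of this invariance (illustrative only, not LEFTNet code):

```python
import numpy as np

def pairwise_distances(X):
    """All pairwise Euclidean distances for points X of shape (N, 3)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def random_rotation(rng):
    """Random 3x3 rotation matrix via QR decomposition of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
    Q = Q * np.sign(np.diag(R))   # fix column signs for a consistent convention
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1             # ensure det = +1: a rotation, not a reflection
    return Q
```

Distance features are SE(3)-invariant, so any GNN layer consuming only them is automatically symmetric; the harder design question the paper addresses is how much expressive power such invariant local features retain, and when equivariant (vector-valued) features like frames are needed.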
MUBen: Benchmarking the Uncertainty of Pre-Trained Models for Molecular Property Prediction
Large Transformer models pre-trained on massive unlabeled molecular data have
shown great success in predicting molecular properties. However, these models
can be prone to overfitting during fine-tuning, resulting in over-confident
predictions on test data that fall outside of the training distribution. To
address this issue, uncertainty quantification (UQ) methods can be used to
improve the models' calibration of predictions. Although many UQ approaches
exist, not all of them lead to improved performance. While some studies have
used UQ to improve molecular pre-trained models, the process of selecting
suitable backbone and UQ methods for reliable molecular uncertainty estimation
remains underexplored. To address this gap, we present MUBen, which evaluates
different combinations of backbone and UQ models to quantify their performance
for both property prediction and uncertainty estimation. By fine-tuning various
backbone molecular representation models using different molecular descriptors
as inputs with UQ methods from different categories, we critically assess the
influence of architectural decisions and training strategies. Our study offers
insights for selecting UQ and backbone models, which can facilitate research on
uncertainty-critical applications in fields such as materials science and drug
discovery.
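A common calibration metric in this kind of benchmark is the expected calibration error (ECE) for classification tasks, which measures how far predicted confidences drift from observed accuracies. The sketch below is a generic implementation for illustration, not MUBen's evaluation code:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence, then average |accuracy - confidence|
    per bin, weighted by the fraction of samples falling in that bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0 is ignored,
        # which is acceptable for this sketch.
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()    # empirical accuracy in the bin
            conf = confidences[mask].mean()  # mean predicted confidence
            ece += mask.mean() * abs(acc - conf)
    return ece
```

An overconfident fine-tuned model (high confidence, lower accuracy) yields a large ECE, which is exactly the failure mode on out-of-distribution test data that UQ methods are meant to reduce.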
M2Hub: Unlocking the Potential of Machine Learning for Materials Discovery
We introduce M2Hub, a toolkit for advancing machine learning in materials
discovery. Machine learning has achieved remarkable progress in modeling
molecular structures, especially biomolecules for drug discovery. However, the
development of machine learning approaches for modeling materials structures
lags behind, partly due to the lack of an integrated platform that
enables access to diverse tasks for materials discovery. To bridge this gap,
M2Hub will enable easy access to materials discovery tasks, datasets,
machine learning methods, evaluations, and benchmark results that cover the
entire workflow. Specifically, the first release of M2Hub focuses on three
key stages in materials discovery: virtual screening, inverse design, and
molecular simulation, including 9 datasets that cover 6 types of materials
with 56 tasks across 8 types of material properties. We further provide 2
synthetic datasets for the purpose of generative tasks on materials. In
addition to random data splits, we also provide 3 additional data partitions to
reflect real-world materials discovery scenarios. State-of-the-art machine
learning methods (including those that are suitable for materials structures but
have never been compared in the literature) are benchmarked on representative
tasks. Our code and library are publicly available at
https://github.com/yuanqidu/M2Hub.
A meaningful exploration of ofatumumab in refractory NMOSD: a case report
Objective: To report the case of a patient with refractory neuromyelitis optica spectrum disorder (NMOSD) who, despite showing poor response or intolerance to multiple immunosuppressants, was successfully treated with ofatumumab.
Case presentation: A 42-year-old female was diagnosed with NMOSD at the first episode of the disease. Despite treatment with intravenous methylprednisolone, immunoglobulin, rituximab and immunoadsorption, together with oral steroids, azathioprine, mycophenolate mofetil and tacrolimus, she experienced various adverse events, such as abnormal liver function, repeated infections, fever, rashes and hemorrhagic shock, and suffered five relapses over the ensuing four years. Clinicians therefore decided to initiate ofatumumab to control the disease. The patient received 9 doses of ofatumumab over the next 10 months at customized intervals. Her symptoms remained stable, with no recurrence or adverse events.
Conclusion: Ofatumumab might serve as an effective and safe alternative for NMOSD patients who are resistant to other current immunotherapies.
Smoking Cessation With 20 Hz Repetitive Transcranial Magnetic Stimulation (rTMS) Applied to Two Brain Regions: A Pilot Study
Chronic smoking impairs brain functions in the prefrontal cortex and the projecting meso-cortical limbic system. The purpose of this pilot study is to examine whether modulating frontal brain activity using high-frequency repetitive transcranial magnetic stimulation (rTMS) can improve smoking cessation and to explore the changing pattern of brain activity after treatment. Fourteen treatment-seeking smokers were offered a program involving 10 days of rTMS treatment with a follow-up for another 25 days. 20 Hz rTMS was applied sequentially to the left dorsolateral prefrontal cortex (DLPFC) and the superior medial frontal cortex (SMFC). Carbon monoxide (CO) levels, withdrawal and craving scales, and neuroimaging data were collected. Ten smokers completed the entire treatment program, and 90% of them did not smoke during the 25-day follow-up. A significant reduction in smoking craving and in resting brain activity, measured by cerebral blood flow (CBF) and brain entropy (BEN), was observed after 10 days of 20 Hz rTMS treatment compared to baseline. Although limited by the sample size, these pilot findings suggest a high potential of multiple-target high-frequency rTMS for smoking cessation and the utility of fMRI for objectively assessing treatment effects.